Big data
Definitions
Big data

Overview
Big data is big in two different senses. It is big in the quantity and variety of data that are available to be processed. And it is big in the scale of analysis (termed "analytics") that can be applied to those data, ultimately to make inferences and draw conclusions.

By data mining and other kinds of analytics, non-obvious and sometimes private information can be derived from data that, at the time of their collection, seemed to raise no, or only manageable, privacy issues. Such new information, used appropriately, may often bring benefits to individuals and society. Even in principle, however, one can never know what information may later be extracted from any particular collection of big data, both because that information may result only from the combination of seemingly unrelated data sets, and because the algorithm for revealing the new information may not even have been invented at the time of collection.

The same data and analytics that provide benefits to individuals and society if used appropriately can also create potential harms: threats to individual privacy according to privacy norms both widely shared and personal. For example, large-scale analysis of research on disease, together with health data from electronic medical records and genomic information, might lead to better and timelier treatment for individuals but also to inappropriate disqualification for insurance or jobs. GPS tracking of individuals might lead to better community-based public transportation facilities, but also to inappropriate use of the whereabouts of individuals.

The Three Vs
A common framework for characterizing big data relies on the "three Vs": the volume, velocity, and variety of data, each of which is growing at a rapid rate as technological advances permit the analysis and use of these data in ways that were not possible previously.
* Volume refers to the vast quantity of data that can be gathered and analyzed effectively. The costs of collecting and storing data continue to drop dramatically, and the ability to access millions of data points increases the predictive power of consumer data analysis.
* Velocity is the speed with which companies can accumulate, analyze, and use new data. Technological improvements allow companies to harness the predictive power of data more quickly than ever before, sometimes instantaneously.
* Variety means the breadth of data that companies can analyze effectively. Companies can now combine very different, once unlinked, kinds of data (either on their own or through data brokers or analytics firms) to infer consumer preferences and predict consumer behavior, for example.

Together, the three Vs allow for more robust research and correlation. Previously, finding a representative data sample sufficient to produce statistically significant results could be very difficult and expensive. Today, the scope and scale of data collection enable cost-effective, substantial research of even obscure or mundane topics (e.g., the amount of foot traffic in a park at different times of day).

Sources of big data
The sources and formats of data continue to grow in variety and complexity. A partial list of sources includes the public web; social media; mobile applications; federal, state and local records and databases; commercial databases that aggregate individual data from a spectrum of commercial transactions and public records; geospatial data; surveys; and traditional offline documents scanned by optical character recognition into electronic form. The advent of more Internet-enabled devices and sensors expands the capacity to collect data from physical entities, including sensors and radio-frequency identification (RFID) chips. Personal location data can come from GPS chips, cell-tower triangulation of mobile devices, mapping of wireless networks, and in-person payments.
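As a rough illustration of the "variety" point above, that once-unlinked data sets can be combined to support new inferences, the following minimal Python sketch links two small record sets on a shared key. Everything in it is hypothetical: the record fields (customer_id, item, place, hour), the sample values, and the link_records helper are invented for this example and do not come from the cited reports or from any real system.

```python
from collections import defaultdict

# Hypothetical records: every field name and value is invented for illustration.
purchases = [
    {"customer_id": "c1", "item": "running shoes"},
    {"customer_id": "c2", "item": "garden hose"},
    {"customer_id": "c1", "item": "energy gel"},
]
checkins = [
    {"customer_id": "c1", "place": "city park", "hour": 6},
    {"customer_id": "c2", "place": "hardware store", "hour": 14},
    {"customer_id": "c1", "place": "city park", "hour": 7},
]


def link_records(purchases, checkins):
    """Group both data sets by the shared key and build one combined profile per customer."""
    profiles = defaultdict(lambda: {"items": [], "places": []})
    for row in purchases:
        profiles[row["customer_id"]]["items"].append(row["item"])
    for row in checkins:
        profiles[row["customer_id"]]["places"].append((row["place"], row["hour"]))
    return dict(profiles)


profiles = link_records(purchases, checkins)
```

In this toy data, neither set alone says much about customer c1, but the combined profile pairs running-related purchases with early-morning park visits. That is the kind of derived information the overview describes, and it can be put to either beneficial or inappropriate use.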
References
Sources
* "Overview" section: Big Data and Privacy: A Technological Perspective, at ix-x.
* "Sources of big data" section: Big Data: Seizing Opportunities, Preserving Values, at 5.
* "The Three Vs" section: Big Data: A Tool for Inclusion or Exclusion?: Understanding the Issues, at 1-2.

See also
* 3Vs
* Big data analytics
* Big Data and Privacy: A Technological Perspective
* Big data paradigm
* Big Data Means Big Opportunities and Big Challenges: Promoting Financial Inclusion and Consumer Protection in the “Big Data” Financial Era
* Big Data Research and Development Initiative
* Big Data Senior Steering Group
* Big Data: A Tool for Inclusion or Exclusion?: Understanding the Issues
* Big Data: A Report On Algorithmic Systems, Opportunity, and Civil Rights
* Big Data: A Tool for Inclusion or Exclusion? (Workshop)
* Big Data: Seizing Opportunities, Preserving Values
* Big Data: Seizing Opportunities, Preserving Values: Interim Progress Report
* Big Data: The Next Frontier for Innovation, Competition, and Productivity
* Delivering on the Promise of Big Data and the Cloud
* The Promise and Peril of Big Data

External resources
* "Data, Data Everywhere, A Special Report on Managing Information," The Economist (Feb. 25, 2010).
* "Dealing with Data," Science (special issue) (Feb. 11, 2011).
* Robert Kirkpatrick, "Beyond Targeted Ads: Big Data for a Better World" (2012).
* Jules Polonetsky & Omer Tene, "Privacy and Big Data: Making Ends Meet," 66 Stan. L. Rev. Online 25 (2013).
* Edith Ramirez, "The Privacy Challenges of Big Data: A View from the Lifeguard's Chair," Keynote Address by FTC Chairwoman Edith Ramirez (Technology Policy Institute Aspen Forum) (Aug. 19, 2013).